1 Introduction

This paper follows on from Binary.com Interview Q1 (or Alternate link) and Binary.com Interview Q1 (Extention) (or Alternate link). Both papers test the accuracy of various statistical models and report a high ROI per annum. However, as stated in those papers, one concern remained: it was not known whether the daily highest or lowest price came first, so the papers compared all possible outcomes. In the end the Hi-Lo and Lo-Hi models made the highest returns. The research in this paper is intended to be applicable to real-life trading.

To test the timeline of the daily highest and lowest prices while writing the Real Time Trading System (Trial), I created this file to read the tick-data history and test the ROI (Return On Investment) per annum. Kindly refer to section Reference for further information.

2 Data

2.1 Read Data

I use more than 3 years of data (from week 1 2015 until week 27 2018) (footnote 1) for the experiment. The 1st year of data is burn-in data for statistical modelling and prediction, while the following 2 years of data are used for forecasting and staking. There are 52 trading weeks in a year.
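The burn-in/forecast split described above can be sketched as follows. This is only an illustration on a synthetic weekday calendar, not the code used in this paper; the cut-off date `2016-01-01` is an assumption inferred from the 1-year burn-in period, and the names `cutoff`, `burn_in` and `forecast` are hypothetical.

```r
## Synthetic weekday calendar covering the study period (illustrative only).
dates <- seq(as.Date("2015-01-05"), as.Date("2018-07-05"), by = "day")
dates <- dates[!format(dates, "%u") %in% c("6", "7")]  # drop Saturday/Sunday

## Assumed cut-off: the first year is burn-in, the rest is forecasting/staking.
cutoff   <- as.Date("2016-01-01")
burn_in  <- dates[dates <  cutoff]
forecast <- dates[dates >= cutoff]

range(burn_in)
range(forecast)
```

Real trading calendars also exclude holidays, so the actual split should be taken from the dataset's own index rather than from a generated calendar.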

Operators charge a certain spread, but the OHLC dataset does not provide information on the course of the exchange rate within the day. For the normal FOREX trading market, the tick-data history is similar to the rebirth model in soccer betting, while the financial betting market is similar to pre-match soccer betting.

I gathered the 3 datasets from the websites below:

  • 1st Dataset - quantmod::getSymbols(src = 'yahoo'): contains daily OHLCV price data (timezone in GMT). The place-orders function needs to know whether the highest or lowest price came first, therefore I gathered that information via FXCMTickData.
  • 2nd Dataset - FXCMTickData: contains the timeline of the highest and lowest prices within a day (timezone in UTC). The daily dataset is used to forecast the daily price.
  • 3rd Dataset - TFX::queryFX(): shows the current price. The place-orders function requires historical data, therefore I gathered the data via FXCMTickData.

The prices will probably be inconsistent among the 3 datasets; however, it would take a few years to gather all real-time prices via TFX::queryFX(). Ideally all data would be gathered via 1 channel (queryFX()). Feel free to read further in this paper, where the consistency of the datasets is verified.

There will be another research project, Real Time High Frequency Trading, which collects real-time data and studies high-frequency trading on tick data. Feel free to browse Real Time FXCM.

Below is the dataset gathered via getSymbols(src = 'yahoo').

## read saved dataset.
mbase <- readRDS('./data/USDJPY/USDJPY.rds')
## [1] "mbase : [911 x 6]"
Statistic  Index       USDJPY.Open  USDJPY.High  USDJPY.Low  USDJPY.Close  USDJPY.Volume  USDJPY.Adjusted
Min.       2015-01-05        99.89        100.4       99.57         99.91              0            99.91
1st Qu.    2015-11-19       109.33        109.6      108.90        109.32              0           109.32
Median     2016-10-03       112.59        112.9      112.15        112.58              0           112.58
Mean       2016-10-03       113.21        113.6      112.79        113.21              0           113.21
3rd Qu.    2017-08-18       118.97        119.4      118.67        118.97              0           118.97
Max.       2018-07-05       125.60        125.8      124.97        125.63              0           125.63

Table 2.1.1 : 1st dataset - summary of daily price dataset.

2.2 Tidy Data

For the 2nd dataset, I gathered the tick data via FXCMTickData. The raw data has more than a million rows (about a million rows per file, with 52 files over 52 weeks), so I tidied it and kept only the highest and lowest bid/ask prices. The table below shows the timeline, i.e. whether the highest or lowest price came first.

## read tick data.
HL_tick_data <- read_HL_tick_data()
dtID <- unique(HL_tick_data$Date)

## arrange the sequence of highest and lowest price.
HL_tick_data %<>% tbl_df %>% mutate(sq = rep(1:4, length(dtID)))

## Print dataset
HL_tick_data
## # A tibble: 4,228 x 5
##    Date       DateTime              Bid   Ask    sq
##    <date>     <dttm>              <dbl> <dbl> <int>
##  1 2014-12-28 2014-12-28 22:00:00  120.   NA      1
##  2 2014-12-28 2014-12-28 22:36:35   NA   120.     2
##  3 2014-12-28 2014-12-28 22:52:18   NA   120.     3
##  4 2014-12-28 2014-12-28 23:42:36  120.   NA      4
##  5 2014-12-29 2014-12-29 05:10:03   NA   120.     1
##  6 2014-12-29 2014-12-29 05:10:04  120.   NA      2
##  7 2014-12-29 2014-12-29 19:18:16   NA   121.     3
##  8 2014-12-29 2014-12-29 19:36:59  121.   NA      4
##  9 2014-12-30 2014-12-30 00:02:23   NA   121.     1
## 10 2014-12-30 2014-12-30 00:29:35  121.   NA      2
## # ... with 4,218 more rows

Table 2.2.1 : 2nd dataset - daily high-low price tick-data.
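The idea of reading the high/low timeline from tick data can be sketched on a toy example. This is a minimal illustration with made-up prices, not the `read_HL_tick_data()` implementation (which is not shown in this paper). It assumes the daily high is the maximum Ask and the daily low is the minimum Bid, in line with the filter used later in section Kelly Criterion.

```r
## Toy tick data for one day (values are made up for illustration).
ticks <- data.frame(
  DateTime = as.POSIXct(c("2016-01-05 01:00:00", "2016-01-05 05:30:00",
                          "2016-01-05 12:15:00", "2016-01-05 20:45:00"),
                        tz = "UTC"),
  Bid = c(119.40, 119.10, 119.55, 119.30),
  Ask = c(119.42, 119.12, 119.57, 119.32)
)

## Time of the daily high (max Ask) and of the daily low (min Bid).
t_high <- ticks$DateTime[which.max(ticks$Ask)]
t_low  <- ticks$DateTime[which.min(ticks$Bid)]

## Which extreme came first within the day?
first_extreme <- if (t_low < t_high) "low" else "high"
first_extreme
```

In this toy day the low is reached in the morning and the high around midday, so a buy limit order near the low would be reached before a sell limit order near the high.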

For the 3rd dataset, because it is real-time data, the Real Time Trading System (Trial) web application only stores the real-time bid/ask transaction prices and does not collect all tick data by the second. However, feel free to gather the data from DataCollection.

3 Statistical Modelling

3.1 Forecast Daily Hi-Lo Price

I tried to apply Lasso, Elastic Net and Ridge models to test the accuracy of prediction via shinyApp.

\[\begin{equation} \sigma^2_{t} = \omega + \sum_{i=1}^{\rho}(\alpha_{i} + \gamma_{i} I_{t-i}) \varepsilon_{t-i}^{2} + \sum_{j=1}^{q}\beta_{j}\sigma^{2}_{t-j}\ \cdots\ Equation\ 3.1.1 \end{equation}\]

Here I directly apply the GJR-GARCH model (footnote 2), because I had already compared several statistical models and found it to be the best-fitting one. Kindly refer to Binary.com Interview Q1 for the paper. The models compared were:

  • Auto ARIMA models (adjusted, using the optimal AR and MA parameters)
  • Exponential Time Series (27 ETS models)
  • Univariate GARCH models (GARCH, T-GARCH, GJR-GARCH, eGARCH, etc., altogether 12 models)
  • Exponential Weighted Moving Average
  • Markov Chain Monte Carlo
  • Bayesian Time Series
  • MIDAS
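Equation 3.1.1 can be illustrated with a small base-R recursion for the simplest case, one ARCH lag and one GARCH lag. This is a sketch only, not the fitting procedure used in this paper: the parameter values are arbitrary, the function name `gjr_garch_sigma2` is hypothetical, and the indicator \(I_{t-1}\) is assumed to follow the standard GJR definition (1 after a negative shock, 0 otherwise).

```r
## GJR-GARCH(1,1) conditional variance recursion (Equation 3.1.1, one lag each).
## Parameter values used below are arbitrary illustrations, not fitted estimates.
gjr_garch_sigma2 <- function(eps, omega, alpha, gamma, beta,
                             sigma2_init = var(eps)) {
  n <- length(eps)
  sigma2 <- numeric(n)
  sigma2[1] <- sigma2_init
  for (t in 2:n) {
    ind <- as.numeric(eps[t - 1] < 0)  # I_{t-1}: 1 after a negative shock
    sigma2[t] <- omega + (alpha + gamma * ind) * eps[t - 1]^2 +
      beta * sigma2[t - 1]
  }
  sigma2
}

eps <- c(0.5, -0.5, 0.2, -1.0, 0.3)   # made-up residuals
s2  <- gjr_garch_sigma2(eps, omega = 0.01, alpha = 0.05, gamma = 0.10,
                        beta = 0.80, sigma2_init = 0.25)
s2
```

The negative residuals raise the next period's variance by the extra `gamma` term, which is the asymmetry that distinguishes GJR-GARCH from a plain GARCH model.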

I also wrote an extension page for Q1 which analyses multiple currencies and models on data from minutely to daily frequency. Feel free to browse Binary.com Interview Q1 (Extention). That paper compares the models and selects the optimal predictive model for various numbers of observations.

3.2 ARMA Order

Here I read my saved dataset, which forecasts the daily Hi-Lo price 1 trading day in advance. Kindly refer to Real Time Trading System (Trial) for more information.

##      LatestDate.GMT Lst.Open Lst.High Lst.Low Lst.Close ForecastDate.GMT
##   1:     2016-01-04  120.317  120.448 118.720   120.311              T+1
##   2:     2016-01-05  119.474  119.680 118.801   119.467              T+1
##   3:     2016-01-06  119.100  119.150 118.260   119.102              T+1
##   4:     2016-01-07  118.609  118.753 117.364   118.610              T+1
##   5:     2016-01-08  117.530  118.710 117.512   117.540              T+1
##  ---                                                                    
## 647:     2018-06-28  110.442  110.871 110.388   110.486              T+1
## 648:     2018-07-01  110.745  111.053 110.606   110.710              T+1
## 649:     2018-07-02  110.887  111.126 110.510   110.871              T+1
## 650:     2018-07-03  110.400  110.546 110.282   110.408              T+1
## 651:     2018-07-04  110.497  110.703 110.295   110.502              T+1
##      Fct.Open Fct.High  Fct.Low Fct.Close
##   1: 120.4105 120.2905 119.2893  120.3278
##   2: 119.4793 120.1193 119.2821  119.4786
##   3: 119.0646 119.7417 119.2809  119.0649
##   4: 118.6654 119.4668 119.2840  118.7410
##   5: 117.5048 119.6253 119.2871  117.5493
##  ---                                     
## 647: 110.4233 110.9678 109.8558  110.4686
## 648: 110.8265 111.0962 110.5146  110.7449
## 649: 110.7930 110.9810 110.4011  110.7755
## 650: 110.4582 110.4743 110.0168  110.4741
## 651: 110.5188 110.7678 110.3163  110.5095

Table 3.1.1 : Forecast high-low daily price.

pred.data %>% dplyr::filter(LatestDate.GMT == '2017-01-12') %>% 
  kable %>% 
  kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive')) %>%
  scroll_box(width = '100%')
LatestDate.GMT Lst.Open Lst.High Lst.Low Lst.Close ForecastDate.GMT Fct.Open Fct.High Fct.Low Fct.Close
2017-01-12 115.055 115.219 113.76 NA T+1 115.1867 110.5123 113.6786 NA

Table 3.1.2 : Forecast daily price missing close price.

Because the closing price could not be forecast for 2017-01-12 (footnote 3), I omit the transaction on that day.

Below I combine the daily price dataset with the forecast Hi-Lo price.

## filter data.
pred.data %<>% dplyr::filter(LatestDate.GMT != '2017-01-12')

## Copied dataset.
mbase <- data.frame(Date = index(mbase), mbase) %>% tbl_df
pred.data %<>% tbl_df

## Add `Date` column as forecasted Date.
pred.data$Date <- lead(pred.data$LatestDate.GMT)
pred.data$Date[length(pred.data$Date)] <- data.table::last(pred.data$LatestDate.GMT) + days(1)

pred.data %<>% select(Date, Fct.Open, Fct.High, Fct.Low, Fct.Close) %>% data.table

## Merge dataset.
pred <- merge(tbl_df(mbase), pred.data, by = 'Date') %>% 
  tbl_df %>% select(-USDJPY.Volume, -USDJPY.Adjusted)
rm(pred.data)

## Print dataset.
pred %>% data.table
##            Date USDJPY.Open USDJPY.High USDJPY.Low USDJPY.Close Fct.Open
##   1: 2016-01-05     119.474     119.680    118.801      119.467 120.4105
##   2: 2016-01-06     119.100     119.150    118.260      119.102 119.4793
##   3: 2016-01-07     118.609     118.753    117.364      118.610 119.0646
##   4: 2016-01-08     117.530     118.710    117.512      117.540 118.6654
##   5: 2016-01-11     117.073     117.992    117.009      117.080 117.5048
##  ---                                                                    
## 646: 2018-07-01     110.745     111.053    110.606      110.710 110.4233
## 647: 2018-07-02     110.887     111.126    110.510      110.871 110.8265
## 648: 2018-07-03     110.400     110.546    110.282      110.408 110.7930
## 649: 2018-07-04     110.497     110.703    110.295      110.502 110.4582
## 650: 2018-07-05     110.555     110.778    110.384      110.579 110.5188
##      Fct.High  Fct.Low Fct.Close
##   1: 120.2905 119.2893  120.3278
##   2: 120.1193 119.2821  119.4786
##   3: 119.7417 119.2809  119.0649
##   4: 119.4668 119.2840  118.7410
##   5: 119.6253 119.2871  117.5493
##  ---                            
## 646: 110.9678 109.8558  110.4686
## 647: 111.0962 110.5146  110.7449
## 648: 110.9810 110.4011  110.7755
## 649: 110.4743 110.0168  110.4741
## 650: 110.7678 110.3163  110.5095

Table 3.1.3 : Tidy dataset for forecast high-low daily price.

Below I test whether the dataset scraped from quantmod::getSymbols(src = 'yahoo') matches FXCMTickData. Unfortunately the two datasets do not tally with each other.

mb.dateID <- unique(mbase$Date)
td.dateID <- unique(HL_tick_data$Date)

## Check the start and end date
data.frame(MB = range(mb.dateID), 
           TD = range(td.dateID)) %>% 
  kable %>% 
  kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive'), full_width = FALSE, position = 'float_left') %>%
  footnote(general = 'Date range of 1st dataset and 2nd dataset.',
           general_title = 'Table 3.1.4 : ', footnote_as_chunk = TRUE)
MB TD
2015-01-05 2014-12-28
2018-07-05 2018-07-06
Table 3.1.4 : Date range of 1st dataset and 2nd dataset.

Table 3.1.4 shows the date range for both MB (mbase dataset) and TD (tick-data).

Below are all dates that appear in one dataset but NOT in the other. Unfortunately the inconsistency of the dataset gathered from getSymbols(src = 'yahoo') will bias all the predictive models; it will affect the staking amount and eventually the ROI. The dataset used to work fine a few years ago. Another research project, which gathers data only from the operator, will address the issue.

## Check if dateID not in another dataset.
mb.dateID[!mb.dateID %in% td.dateID]
##  [1] "2015-12-25" "2016-01-01" "2016-02-03" "2016-02-04" "2016-02-05"
##  [6] "2016-02-19" "2016-04-20" "2016-04-21" "2016-05-04" "2016-05-05"
## [11] "2016-11-17" "2016-11-18" "2016-12-26" "2017-01-02" "2017-04-17"
## [16] "2017-04-18" "2017-04-19" "2017-04-20" "2017-04-26" "2017-04-27"
## [21] "2017-05-16" "2017-05-17" "2017-05-18" "2017-05-23" "2017-05-24"
## [26] "2017-05-25" "2017-06-29" "2017-08-08" "2017-08-09" "2017-08-10"
## [31] "2017-12-25"
td.dateID[!td.dateID %in% mb.dateID]
##   [1] "2014-12-28" "2014-12-29" "2014-12-30" "2014-12-31" "2015-01-02"
##   [6] "2015-01-04" "2015-01-11" "2015-01-18" "2015-01-25" "2015-02-01"
##  [11] "2015-02-08" "2015-02-15" "2015-02-22" "2015-03-01" "2015-03-08"
##  [16] "2015-03-15" "2015-03-22" "2015-04-03" "2015-04-10" "2015-04-17"
##  [21] "2015-04-24" "2015-05-01" "2015-05-08" "2015-05-15" "2015-05-22"
##  [26] "2015-05-29" "2015-06-05" "2015-06-12" "2015-06-19" "2015-06-26"
##  [31] "2015-07-03" "2015-07-10" "2015-07-17" "2015-07-24" "2015-07-31"
##  [36] "2015-08-07" "2015-08-14" "2015-08-21" "2015-08-28" "2015-09-04"
##  [41] "2015-09-11" "2015-09-21" "2015-09-25" "2015-10-02" "2015-10-09"
##  [46] "2015-10-16" "2015-10-23" "2015-10-25" "2015-11-01" "2015-11-08"
##  [51] "2015-11-15" "2015-11-22" "2015-11-29" "2015-12-06" "2015-12-13"
##  [56] "2015-12-20" "2015-12-27" "2016-01-03" "2016-01-10" "2016-01-17"
##  [61] "2016-01-24" "2016-01-31" "2016-02-07" "2016-02-14" "2016-02-21"
##  [66] "2016-02-28" "2016-03-06" "2016-03-13" "2016-03-20" "2016-04-01"
##  [71] "2016-04-08" "2016-04-29" "2016-05-13" "2016-05-20" "2016-05-27"
##  [76] "2016-06-03" "2016-06-17" "2016-06-24" "2016-07-01" "2016-07-08"
##  [81] "2016-07-15" "2016-07-22" "2016-07-29" "2016-08-05" "2016-08-12"
##  [86] "2016-08-19" "2016-08-26" "2016-09-02" "2016-09-09" "2016-09-16"
##  [91] "2016-09-23" "2016-09-30" "2016-10-07" "2016-10-14" "2016-10-21"
##  [96] "2016-10-28" "2016-10-30" "2016-11-06" "2016-11-13" "2016-11-20"
## [101] "2016-11-27" "2016-12-04" "2016-12-11" "2016-12-18" "2017-01-08"
## [106] "2017-01-15" "2017-01-22" "2017-01-29" "2017-02-05" "2017-02-12"
## [111] "2017-02-19" "2017-02-26" "2017-03-05" "2017-03-12" "2017-03-19"
## [116] "2017-03-31" "2017-04-07" "2017-04-14" "2017-05-05" "2017-05-12"
## [121] "2017-06-02" "2017-06-09" "2017-06-16" "2017-06-23" "2017-07-07"
## [126] "2017-07-10" "2017-07-14" "2017-07-21" "2017-07-28" "2017-08-04"
## [131] "2017-08-18" "2017-08-25" "2017-09-01" "2017-09-08" "2017-09-15"
## [136] "2017-09-22" "2017-09-29" "2017-10-06" "2017-10-13" "2017-10-20"
## [141] "2017-10-27" "2017-10-29" "2017-11-05" "2017-11-12" "2017-11-16"
## [146] "2017-11-17" "2017-11-19" "2017-11-26" "2017-12-03" "2017-12-10"
## [151] "2017-12-17" "2018-01-07" "2018-01-14" "2018-01-21" "2018-01-28"
## [156] "2018-02-04" "2018-02-11" "2018-02-18" "2018-02-25" "2018-03-04"
## [161] "2018-03-11" "2018-03-18" "2018-03-30" "2018-04-06" "2018-04-13"
## [166] "2018-04-20" "2018-04-27" "2018-05-04" "2018-05-11" "2018-05-18"
## [171] "2018-05-25" "2018-06-01" "2018-06-08" "2018-06-15" "2018-06-22"
## [176] "2018-06-29" "2018-07-06"

Graph 3.2.1A : Forecast daily highest price vs real daily highest price.

Graph 3.2.1B : Forecast daily lowest price vs real daily lowest price.

Graph 3.2.1A and Graph 3.2.1B above compare the real price and the forecast price. The following section compares the MSE (Mean Squared Error).

3.3 Mean Squared Error

## Mean Squared Error : Comparison of accuracy.
## https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html
data.frame(
  Category = c('High = ', 'Low = ', 'Close = '), 
  MSE = c(mean((pred$Fct.High - pred$USDJPY.High)^2), 
          mean((pred$Fct.Low - pred$USDJPY.Low)^2), 
          mean((pred$Fct.Close - pred$USDJPY.Close)^2))) %>% 
  kable %>% 
  kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive'), full_width = FALSE, position = 'float_right')
Category MSE
High = 7.747397e+00
Low = 2.978715e+00
Close = 5.109613e+45

\[\begin{equation} \frac{1}{n}\sum_{t=1}^{n}e_t^2\ \cdots\ Equation\ 3.2.1 \end{equation}\]

Table 3.2.1 (Mean Squared Error of the high-low daily price) shows the accuracy of the predictive model, where \(e_t\) in Equation 3.2.1 is the difference between the forecast price and the real price on day \(t\). You can also refer to previous studies which compare the accuracy of the predicted Open, High, Low and Close prices.
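The Close MSE reported above is many orders of magnitude larger than the High and Low MSE, which may indicate that a few extreme forecasts dominate the mean. The sketch below, on made-up numbers rather than the `pred` dataset, shows one way to check how much of the total squared error comes from a single observation. The variable names are hypothetical.

```r
## Toy actual vs forecast prices; the last forecast is deliberately absurd.
actual   <- c(110.5, 110.7, 110.9, 110.4, 110.6)
forecast <- c(110.4, 110.8, 110.8, 110.5, 1e6)

sq_err <- (forecast - actual)^2
mse    <- mean(sq_err)

## Share of the total squared error contributed by each observation.
share <- sq_err / sum(sq_err)
worst <- which.max(sq_err)

mse
worst
share[worst]
```

If one or two rows carry almost all of the squared error, a robust summary such as the median squared error gives a fairer picture of typical accuracy than the mean.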

4 Betting Strategy

4.1 Kelly Criterion

\[\begin{equation} f = \frac{Edge}{Odds} = \frac{p^{*} x - (1 - p^{*})}{x} \ \cdots Equation\ 4.1.1 \end{equation}\]

Section 1 of Application of Kelly Criterion model in Sportsbook Investment shows the ROI for betting on sportsbooks; section 2 will be started in my spare time after the FOREX market research.

For the betting strategy, I need to correct Binary.com Interview Q1, as stated in section Introduction. Below I tidy the dataset to match the bid/ask price used to close a transaction.

As we know from the previous paper, the closing price is the settled price (closed transaction) if the forecast price does not occur within a day (from 12:00:01 AM until 12:00:00 AM the next day; because the system takes about a minute to calculate the price, the opening price does not count). 2 limit orders (buy and sell) are placed after 12:00:00 AM. Once one of the limit orders is filled (opened transaction), the other limit order automatically becomes a close-transaction request. Therefore there is at most one transaction within a trading day, and none if neither limit order is filled.
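The fill rule in the paragraph above can be sketched as a small classifier. This is an illustration on made-up prices, not the paper's own tidy step; it treats a limit order as filled when its price lies inside the day's [Low, High] range, which is the same in-range condition used for the Sell and Buy flags in the code that follows. The function name `classify_day` is hypothetical.

```r
## Which of the 2 limit orders would be filled within a day?
## A limit order counts as filled when its price lies inside [Low, High].
classify_day <- function(fct_high, fct_low, high, low) {
  sell_hit <- fct_high >= low & fct_high <= high  # sell limit at forecast high
  buy_hit  <- fct_low  >= low & fct_low  <= high  # buy limit at forecast low
  ifelse(sell_hit & buy_hit, "both",
  ifelse(sell_hit, "sell",
  ifelse(buy_hit,  "buy", "none")))
}

## Made-up prices for four days.
high     <- c(111.0, 111.0, 111.0, 111.0)
low      <- c(110.0, 110.0, 110.0, 110.0)
fct_high <- c(110.8, 110.8, 111.5, 111.5)
fct_low  <- c(110.2, 109.5, 110.2, 109.5)

day_type <- classify_day(fct_high, fct_low, high, low)
day_type
```

On a "both" day the order that is reached first opens the position and the other closes it, which is why the timeline of the daily high and low matters; on a "sell" or "buy" day the position is settled at the closing price.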

## Here I combine real-time dataset with the daily dataset.
pred <- merge(HL_tick_data, pred,  by = 'Date') %>% 
  tbl_df %>% 
  arrange(DateTime)

## Tidy dataset.
pred %<>% group_by(Date) %>% 
    dplyr::filter(Bid == min(Bid, na.rm = TRUE)|Ask == max(Ask, na.rm = TRUE)) %>% 
    dplyr::select(DateTime, Date, USDJPY.High, USDJPY.Low, USDJPY.Close, sq, Bid, Ask, Fct.High, Fct.Low, Fct.Close) %>% 
    rename(High = USDJPY.High, Low = USDJPY.Low, Close = USDJPY.Close) %>% tbl_df

pred %<>% mutate(
        Fct.High = round(Fct.High, 3), 
        Fct.Low = round(Fct.Low, 3), 
        Sell = ifelse(Fct.High <= High & Fct.High >= Low, 1, 0), 
        Buy = ifelse(Fct.Low >= Low & Fct.Low <= High, 1, 0))
    
## Follow the sequence to determine whether the buy or sell limit order is
## filled first within a day. The other side then automatically becomes the
## close-transaction limit order; no further limit order is placed.
pred <- ldply(split(pred, pred$Date), function(x) {
    ## `Sell`/`Buy` decide the transaction type; '0' marks rows with no trade.
    ## (An earlier `Trans` assignment based on `Bid` was always overwritten
    ## by this one, so it is omitted here.)
    x %<>% mutate(Trans = ifelse(Sell == 1, 'sell', 
                          ifelse(Buy == 1, 'buy', '0')))
  
  if(x[1,]$Trans == 'sell'|x[1,]$Trans == 'buy') { #if open transaction.
    x[nrow(x),]$Trans <- 'close'                    #  then close transaction.
  }
  x
  }) %>% tbl_df

pred$.id <- NULL

[2.1.4 Staking Model] in Binary.com Interview Q1 states that in the financial market the payout rate cannot be known in advance, so the forecast price based on statistical models is used instead. I used the amount calculated by the Kelly Criterion as the stake (the possible loss, termed stop-loss in the financial market), together with the variance of the forecast Hi/Lo and closing prices. However, this doesn’t make sense in the financial market.

Hedge Fund Market Wizards has a good discussion on this in Ed Thorp’s chapter.

  • Kelly Criterion has highest long-term growth rate, but gives you higher drawdowns and risk of ruin.
  • In gambling you know your theoretical odds, but in trading your win rate is only an estimate.
  • If you estimate your win rate incorrectly, the profits you’d miss out on by underestimating are less than the losses you’d incur by being overconfident.
  • If there’s large uncertainty about your win rate (e.g. trend following systems) Kelly may be inappropriate.
  • “Suppose you have 1MM and your max allowable drawdown is 200K - then from the Kelly perspective you don’t have 1MM, you have 200K in capital.”
  • If you bet .5 K.C. you get .75 of the returns with .5 of the volatility - half Kelly is better psychologically.

Source : Kelly Criterion in Forex primordia • Jul 22, 2017, 7:07 PM
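The half-Kelly figures quoted above can be checked with a short calculation. This is a sketch under an assumption that the quoted source does not state: the return per unit staked has mean \(\mu\) and variance \(\sigma^2\), and the growth rate is approximated by the quadratic \(g(f) \approx \mu f - \frac{1}{2}\sigma^{2} f^{2}\).

\[\begin{equation} f^{*} = \frac{\mu}{\sigma^{2}}, \qquad g(c f^{*}) = (2c - c^{2})\, g(f^{*}), \qquad g(0.5 f^{*}) = 0.75\, g(f^{*}) \end{equation}\]

Under the same approximation the volatility of the bankroll scales linearly with the fraction staked, so staking \(0.5 f^{*}\) halves the volatility while keeping three quarters of the growth rate.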

EXAMPLE Let’s look at a trading example.

Say, you have a EURUSD trading strategy that wins approximately 70% of the time. The StopLoss in your strategy is 40 pips and the TakeProfit is 20 pips (spread accounted for).

This means that your B and P parameters are as follows:

B = 20 pips / 40 pips = 0.5
P = 70% = 0.7

Let’s input these values into Kelly’s formula and see what we get:

K = ( PxB – (1–P) ) / B
K = ( 0.7 x 0.5 – (1–0.7) ) / 0.5 = 0.1

This means that the optimal risk for this trading strategy that will maximize your profits in the long term is 10%.

If you want to be a bit more conservative, then go with the Half-Kelly of 5%.

Whatever you do, don’t invest more than 10% per trade – it’s pointless.

If you invest more than 20% then you will turn this great strategy into one that will ruin your account.

That’s how you apply the Kelly Criterion in practice.

Source : ForexBoat:Kelly Criterion (footnote 4)
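The quoted example can be reproduced with a few lines of R. This is a minimal sketch of Equation 4.1.1 with the odds \(x\) set to the take-profit/stop-loss ratio B; the function name `kelly_fraction` is hypothetical and is not part of any package used in this paper.

```r
## Kelly fraction f = (p * b - (1 - p)) / b  (Equation 4.1.1 with x = b).
kelly_fraction <- function(p, b) {
  stopifnot(p >= 0, p <= 1, b > 0)
  (p * b - (1 - p)) / b
}

## Quoted example: TakeProfit 20 pips, StopLoss 40 pips, win rate 70%.
b <- 20 / 40
p <- 0.7
k <- kelly_fraction(p, b)

k      # full Kelly, about 10% of the bankroll
k / 2  # half Kelly, about 5% of the bankroll
```

Because the win rate in trading is only an estimate, it may be worth evaluating `kelly_fraction()` over a range of plausible `p` values; with these odds the fraction turns negative once `p` falls below 2/3.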

4.2 Staking Strategy

Chapter 20 Against the Odds: The Mathematics of Gambling (footnote 5) elaborates on the odds price, the edge, the Kelly Criterion staking model, portfolios, entropy and day trading.

\[\begin{eqnarray} g_t(f) &=& \frac{1}{t} \ln \left( \frac{B_t}{B_0} \right) \\ &=& \frac{1}{t} \ln \left( \prod_{i=1}^t [1+r +f(Z_i -r)\varepsilon_i] \right) \\ &=& \frac{1}{t} \sum_{i=1}^t \ln \left( [1+r +f(Z_i -r)\varepsilon_i] \right) \end{eqnarray} \ \cdots Equation\ 4.2.1 \]

By applying the law of large numbers for large data, here we get:

\[\begin{equation} g(f) = \lim_{t \rightarrow \infty} g_t(f) = E[\ln(1+r + f (Z-r)\varepsilon)] \ \cdots Equation\ 4.2.2 \end{equation}\]

where \(\varepsilon_{i}\) is a weight function obtained by applying statistical models to forecast the price. I tried several models and eventually concluded that the GJR-GARCH model generated the highest ROI. The weight function is not the same across currencies, since it is the difference between 2 forecast prices rather than their ratio (footnote 6). You might look at the Kelly adjusted models 5 and 6 in section Return of Investment.

\[\begin{equation} \varepsilon = h(x)_{1} - v \ \cdots Equation\ 4.2.3 \end{equation}\]

\(v\) is a switch function which determines the settled price.

\[\begin{equation} v = \begin{cases} h(x)_{2} & \text{if } Lo \le h(x)_{2} \le Hi \\ Cl & \text{otherwise} \end{cases} \ \cdots Equation\ 4.2.4 \end{equation}\]

where \(Hi\), \(Lo\) and \(Cl\) are the daily highest, lowest and closing prices.
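Equations 4.2.3 and 4.2.4 can be sketched in R as follows. This is an illustration on made-up prices, assuming \(h(x)_{1}\) is the 1st forecast price (where the position is opened) and \(h(x)_{2}\) is the 2nd forecast price (the intended exit); the function names `settled_price` and `weight` are hypothetical.

```r
## Switch function v (Equation 4.2.4) and weight epsilon (Equation 4.2.3).
## h1 = 1st forecast price (entry), h2 = 2nd forecast price (intended exit).
settled_price <- function(h2, hi, lo, cl) {
  ifelse(lo <= h2 & h2 <= hi, h2, cl)
}
weight <- function(h1, h2, hi, lo, cl) {
  h1 - settled_price(h2, hi, lo, cl)
}

## Made-up prices: day 1 reaches the 2nd forecast price, day 2 does not.
hi <- c(111.0, 111.0)
lo <- c(110.2, 110.2)
cl <- c(110.6, 110.6)
h1 <- c(110.9, 110.9)
h2 <- c(110.3, 110.0)

v   <- settled_price(h2, hi, lo, cl)
eps <- weight(h1, h2, hi, lo, cl)
v
eps
```

On the first day the position is settled at the 2nd forecast price; on the second day that price is never reached, so the closing price is used instead and the weight shrinks accordingly.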

Binary.com Interview Q1 applied the more sophisticated model stated above to compare all possible outcomes of predicted price and ROI. Because financial betting only allows a player to place bets and wait for the settlement (unless another bet is placed at another predicted price), there is no limit order or close-transaction request (unlike the FOREX trading market, where more than 1 limit order can be placed to lock in the profit).

The staking model in the previous papers (including Binary.com Interview Q1 (Extention)) has only one MISTAKE: it was written for the real FOREX trading market rather than for financial betting, since I thought only of spread betting and forgot normal betting:

  • Forecast highest price to sell and forecast lowest price to buy, to maximise the profit: I used the 2nd forecast price as the settled price if it was within the daily Hi-Lo range (otherwise the daily closing price was the settled price). This is applicable only to the FOREX trading market, NOT to the financial betting market.
  • However, financial betting is simpler, since the settled price is always the daily closing price.

The paper Binary.com Interview Q1 compares all outcomes:

  • Hi-Cl + Lo-Cl generates the highest ROI in the financial betting market (footnote 7).
  • Hi-Lo or Lo-Hi is the best betting strategy for the normal FOREX market.

.id               StartDate  LatestDate InitFund LatestFund   Profit      RR
fundAutoArimaHICL 2015-01-02 2017-01-20     1000   1401.694 401.6938 140.17%
fundAutoArimaLOCL 2015-01-02 2017-01-20     1000   1499.818 499.8177 149.98%
Combine           2015-01-02 2017-01-20     2000   2901.512 901.5115 145.08%

Table 4.2.1 : betting strategy for financial betting.

No               .id  StartDate LatestDate InitFund  LatesFund    Profit        RR
07 fundAutoArimaHILO 2015-01-02 2017-01-20     1000   1637.251 637.25113 163.7251%
10 fundAutoArimaLOHI 2015-01-02 2017-01-20     1000   1716.985 716.98492 171.6985%

Table 4.2.2 : betting strategy for normal FOREX market.

Kindly refer to 2.1.5 Return of Investment in Binary.com Interview Q1 for the full table.

4.3 Optimal Edge

I don’t pretend to know the optimal edge for staking. Here I compare the above models with the normal Kelly model.

I used to use Edge1 = ifelse(fB1 > 0, B1, ifelse(fS1 > 0, S1, 0)) to measure the edge for staking, but this might be wrong because it makes the edge for selling only a fallback for Buy. Here I use Edge1a = ifelse(fB1 > 0, B1, 0) and Edge1b = ifelse(fS1 > 0, S1, 0) to separate the edges for Buy and Sell. The buy action is still primary and the sell action secondary, but the edges for both buy and sell now stand. Therefore most of the observations have probabilities above 0.5.
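The difference between the two definitions can be seen on a toy example. The values below are made up, and reading `fB1`/`fS1` as the Kelly fractions and `B1`/`S1` as the edges for Buy and Sell is an assumption based on the expressions above rather than on the `sim_staking()` source, which is not shown here.

```r
## Toy Kelly fractions (fB1, fS1) and edges (B1, S1); values are made up.
fB1 <- c(0.10, -0.05,  0.20, -0.10)
B1  <- c(0.03,  0.00,  0.05,  0.00)
fS1 <- c(0.08,  0.12, -0.02, -0.04)
S1  <- c(0.02,  0.04,  0.00,  0.00)

## Old definition: the sell edge is used only when the buy fraction is not positive.
Edge1  <- ifelse(fB1 > 0, B1, ifelse(fS1 > 0, S1, 0))

## New definition: buy and sell edges are kept separately.
Edge1a <- ifelse(fB1 > 0, B1, 0)
Edge1b <- ifelse(fS1 > 0, S1, 0)

data.frame(fB1, fS1, Edge1, Edge1a, Edge1b)
```

In the first row both fractions are positive: the old `Edge1` keeps only the buy edge and drops the sell edge, while `Edge1a` and `Edge1b` retain both.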

## http://srdas.github.io/MLBook/Gambling.html#simulation-of-the-betting-strategy
pred %>% 
    tbl_df %>% 
    select(-DateTime) %>% 
    mutate(
        Fct.High = round(Fct.High, 3), 
        Fct.Low = round(Fct.Low, 3), 
        Sell = ifelse(Fct.High <= High & Fct.High >= Low, 1, 0), 
        Buy = ifelse(Fct.Low >= Low & Fct.Low <= High, 1, 0))
## # A tibble: 1,242 x 13
##    Date        High   Low Close    sq   Bid   Ask Fct.High Fct.Low
##    <date>     <dbl> <dbl> <dbl> <int> <dbl> <dbl>    <dbl>   <dbl>
##  1 2016-01-05  120.  119.  119.     2   NA   120.     120.    119.
##  2 2016-01-05  120.  119.  119.     3  119.   NA      120.    119.
##  3 2016-01-06  119.  118.  119.     1   NA   119.     120.    119.
##  4 2016-01-06  119.  118.  119.     4  118.   NA      120.    119.
##  5 2016-01-07  119.  117.  119.     1   NA   119.     120.    119.
##  6 2016-01-07  119.  117.  119.     3  117.   NA      120.    119.
##  7 2016-01-08  119.  118.  118.     1   NA   119.     119.    119.
##  8 2016-01-08  119.  118.  118.     4  117.   NA      119.    119.
##  9 2016-01-11  118.  117.  117.     1  117.   NA      120.    119.
## 10 2016-01-11  118.  117.  117.     3   NA   118.     120.    119.
## # ... with 1,232 more rows, and 4 more variables: Fct.Close <dbl>,
## #   Sell <dbl>, Buy <dbl>, Trans <chr>

Table 4.3.1 : Buy-Long and Sell-Short table.

4.4 Application of Kelly Criterion to Normal FOREX Market

The previous paper used the forecast Hi-Lo price and the forecast closing price as settlement; the stake was the edge from pnorm(Hi, mean(Lo), sd(Lo)) and vice versa. That is \(\frac{\sigma_{Hi}}{\sigma_{Lo}}\) or \(\frac{\sigma_{Lo}}{\sigma_{Hi}}\), but it misses the difference in pips between the buy/sell price and the closing price.

This paper counts the difference in pips and also the leverage ratio. Risk management on leverage is counted into the staking model.
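The two quantities mentioned above can be sketched as follows. This is an illustration on made-up prices, not the staking model itself; the function name `pip_diff` is hypothetical, and the pip size of 0.01 is the usual convention for JPY pairs such as USD/JPY.

```r
## Pip difference between two prices. For JPY pairs such as USD/JPY,
## 1 pip is conventionally 0.01; most other major pairs use 0.0001.
pip_diff <- function(price_from, price_to, pip_size = 0.01) {
  (price_to - price_from) / pip_size
}

## Made-up trade: buy at the forecast low, settle at the closing price.
buy_price   <- 110.30
close_price <- 110.50
pips <- pip_diff(buy_price, close_price)

## Probability from the previous paper's expression, on toy data.
Lo <- c(110.20, 110.30, 110.10, 110.40, 110.25)
Hi <- 110.90
p_edge <- pnorm(Hi, mean(Lo), sd(Lo))

pips
p_edge
```

The probability alone says nothing about how far the price moves, which is why the pip difference has to enter the stake separately.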

4.5 Application of Kelly Criterion to Financial Betting Market

The previous paper used the Kelly model to place a certain amount and wait for settlement. The forecast closing price was the settled price if it was within the Hi-Lo range of the day. This is not workable, because traders are not allowed to place a close-transaction limit order in the financial betting market.

Because there is no dataset for financial betting, I do not have the payout rate or odds price for minutely, hourly and daily betting. For a sample dataset and research, refer to Application of Kelly Criterion model in Sportsbook Investment, which collected the odds prices of 49 bookmakers and placed bets with 33 operators (footnote 8).

I make a small adjustment in this paper: the forecast closing price cannot be the settled price. There will be another research project which collects odds prices from operators to test the ROI.

5 Return of Investment

5.1 Normal FOREX Market

sim_staking(pred) %>% 
  dplyr::select(Date, Edge1a, Edge1b, Edge2a, Edge2b, Buy, Sell, Trans, BR, Profit, Bal) %>% 
  data.table
##             Date   Edge1a   Edge1b   Edge2a Edge2b Buy Sell Trans       BR
##    1: 2016-01-05 0.000000 9.728813 9.844852      0   1    0   buy 10000.00
##    2: 2016-01-05 0.000000 9.728813 9.844852      0   0    0 close 10000.00
##    3: 2016-01-06 0.000000 9.728381 9.839840      0   0    0     0 10000.00
##    4: 2016-01-06 0.000000 9.728381 9.839840      0   0    0     0 10000.00
##    5: 2016-01-07 0.000000 9.728319 9.827605      0   0    0     0 10000.00
##   ---                                                                     
## 1238: 2018-07-03 9.391490 0.000000 9.391765      0   0    0 close 11798.03
## 1239: 2018-07-04 9.405308 0.000000 9.379727      0   0    1  sell 11798.03
## 1240: 2018-07-04 9.405308 0.000000 9.379727      0   0    0 close 11798.03
## 1241: 2018-07-05 9.394194 0.000000 9.385754      0   0    1  sell 11798.03
## 1242: 2018-07-05 9.394194 0.000000 9.385754      0   0    0 close 11798.03
##       Profit      Bal
##    1:      0 10000.00
##    2:      0 10000.00
##    3:      0 10000.00
##    4:      0 10000.00
##    5:      0 10000.00
##   ---                
## 1238:      0 13612.38
## 1239:      0 13612.38
## 1240:      0 13612.38
## 1241:      0 13612.38
## 1242:      0 13612.38

Table 5.1.1 : ROI for normal FOREX market.

Graph 5.1.1A : ROI for normal FOREX market. (None)

Graph 5.1.1B : ROI for normal FOREX market. (Mixed Kelly model)

Graph 5.1.1C : ROI for normal FOREX market. (Normal Kelly model)

Graph 5.1.1D : ROI for normal FOREX market. (Adjusted Kelly model 1)

Graph 5.1.1E : ROI for normal FOREX market. (Adjusted Kelly model 2)

Graph 5.1.1F : ROI for normal FOREX market. (Adjusted Kelly model 3)

Graph 5.1.1G : ROI for normal FOREX market. (Adjusted Kelly model 4)

Graph 5.1.1H : ROI for normal FOREX market. (Adjusted Kelly model 5)

Graph 5.1.1I : ROI for normal FOREX market. (Adjusted Kelly model 6)

5.2 Financial Betting Market

sim_staking(pred, financial_bet = TRUE) %>% 
  #mutate(Trans = ifelse(!is.na(Bid), 'sell', 'buy')) %>% 
  dplyr::select(Date, Edge1a, Edge1b, Edge2a, Edge2b, Buy, Sell, Trans, BR, Profit, Bal) %>% 
  data.table
##             Date Edge1a   Edge1b   Edge2a Edge2b Buy Sell Trans       BR
##    1: 2016-01-05      0 9.210853 9.210853      0   1    0   buy 10000.00
##    2: 2016-01-05      0 9.210853 9.210853      0   0    0 close 10000.00
##    3: 2016-01-06      0 9.210853 9.210853      0   0    0     0 10000.00
##    4: 2016-01-06      0 9.210853 9.210853      0   0    0     0 10000.00
##    5: 2016-01-07      0 9.210853 9.210853      0   0    0     0 10000.00
##   ---                                                                   
## 1238: 2018-07-03      0 9.378378 9.378378      0   0    0 close 11823.75
## 1239: 2018-07-04      0 9.378378 9.378378      0   0    1  sell 11823.75
## 1240: 2018-07-04      0 9.379157 9.379157      0   0    0 close 11832.96
## 1241: 2018-07-05      0 9.379157 9.379157      0   0    1  sell 11832.96
## 1242: 2018-07-05      0 9.379935 9.379935      0   0    0 close 11842.17
##         Profit      Bal
##    1: 0.000000 10000.00
##    2: 0.000000 10000.00
##    3: 0.000000 10000.00
##    4: 0.000000 10000.00
##    5: 0.000000 10000.00
##   ---                  
## 1238: 0.000000 13664.46
## 1239: 9.378378 13673.84
## 1240: 0.000000 13683.05
## 1241: 9.379157 13692.43
## 1242: 0.000000 13701.64

Table 5.2.1 : ROI for financial betting market.

Graph 5.2.1A : ROI for financial betting market. (Normal Kelly model)

Graph 5.2.1B : ROI for financial betting market. (Adjusted Kelly model 1)

Graph 5.2.1C : ROI for financial betting market. (Adjusted Kelly model 2)

Graph 5.2.1D : ROI for financial betting market. (Adjusted Kelly model 3)

Graph 5.2.1E : ROI for financial betting market. (Adjusted Kelly model 4)

Graph 5.2.1F : ROI for financial betting market. (Adjusted Kelly model 5)

Graph 5.2.1G : ROI for financial betting market. (Adjusted Kelly model 6)

6 Conclusion

I can use tick data from FXCMTickData for this paper, but it is published weekly and is not up to date. getSymbols() can get near-real-time data (15 minutes delayed), which suits daily trading in Real Time Trading System (Trial), but many trading dates are not available.

Therefore the High Frequency Trading research, Real Time FXCM, in which all price data is gathered in real time, will be more accurate.

7 Appendix

7.1 Documenting File Creation

It’s useful to record some information about how your file was created.

[1] “2018-08-02 17:29:02 JST”
Category session_info
version R version 3.4.4 (2018-03-15)
system x86_64, linux-gnu
ui X11
language (EN)
collate C.UTF-8
tz Etc/UTC
date 2018-08-02
Category Sys.info
sysname Linux
release 4.4.0-111-generic
version #134~14.04.1-Ubuntu SMP Mon Jan 15 15:39:56 UTC 2018
nodename 6eca6a57da6a
machine x86_64
login unknown
user rstudio-user
effective_user rstudio-user

7.2 Reference

  1. Quant Strategies HFT
  2. Real Time FXCM
  3. Real Time Trading System (Trial)

Powered by - Copyright® Intellectual Property Rights of Scibrokes® (sole proprietorship)


  1. Feel free to get the data via FXCMTickData.

  2. Kindly refer to GJR-GARCH 模型 (GJR-GARCH model) for more information.

  3. armaSearch() error on method = ‘ML’.

  4. Feel free to read the comments on the article as well, which include Q&A on the Kelly criterion, stop-loss, take-profit and leverage ratio.

  5. Published book Data Science: Theories, Models, Algorithms, and Analytics - Sanjiv Ranjan Das (2017-03-24).

  6. The pip difference for USD/JPY differs from that of other currencies. Therefore the currencies need to be compared (to see whether this generates a higher ROI than the normal Kelly model).

  7. Here I will conduct another research project on financial betting.

  8. The odds prices offered by operators are not the same as the probabilities of the result, just as the exchange rates offered by different operators differ as well.